Probabilistic Spatial Filter Estimation for Signal Enhancement in Multi-Channel Automatic Speech Recognition

نویسندگان

  • Hendrik Kayser
  • Niko Moritz
  • Jörn Anemüller
چکیده

Speech recognition in multi-channel environments requires target speaker localization, multi-channel signal enhancement and robust speech recognition. We here propose a system that addresses these problems: Localization is performed with a recently introduced probabilistic localization method that is based on support-vector machine learning of GCC-PHAT weights and that estimates a spatial source probability map. The main contribution of the present work is the introduction of a probabilistic approach to (re-)estimation of location-specific steering vectors based on weighting of observed inter-channel phase differences with the spatial source probability map derived in the localization step. Subsequent speech recognition is carried out with a DNN-HMM system using amplitude modulation filter bank (AMFB) acoustic features which are robust to spectral distortions introduced during spatial filtering. The system has been evaluated on the CHIME-3 multichannel ASR dataset. Recognition was carried out with and without probabilistic steering vector re-estimation and with MVDR and delay-and-sum beamforming, respectively. Results indicate that the system attains on real-world evaluation data a relative improvement of 31.98% over the baseline and of 21.44% over a modified baseline. We note that this improvement is achieved without exploiting oracle knowledge about speech/non-speech intervals for noise covariance estimation (which is, however, assumed for baseline processing).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

متن کامل

Revereberation Reduction for Improved Speech Recognition

In this paper we present a dereverberation algorithm for improving automatic speech recognition (ASR) results with minimal CPU overhead. As the reverberation tail hurts ASR the most, late reverberation is reduced via gain-based spectral subtraction. We use a multi-band decay model with an efficient method to update it in realtime. In reverberant environments the multi-channel version of the pro...

متن کامل

A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users

In this paper a multi-channel speech enhancement framework for distant speech acquisition in noisy and reverberant environments for Non-negative Matrix Factorization (NMF)-based Automatic Speech Recognition (ASR) is proposed. The system is evaluated for its use in an assistive vocal interface for physically impaired and speech-impaired users. The framework utilises the Spatially Pre-processed S...

متن کامل

Overlapping Frame Approach to Estimate and Reduce Noise from Single Channel Speech

Speech enhancement is a long standing problem with various applications like telephone conversation and speech recognition. The corruption of speech due to presence of additive background noise causes severe difficulties in various communication environments. If the background noise is evolving more slowly than the speech, then the estimation of the noise during speech pauses is easier as compa...

متن کامل

Two-pass Quantile Based Noise Spectrum Estimation

Noise spectrum estimation from a noisy speech signal forms a critical part of such applications as single channel speech enhancement and robust automatic speech recognition (ASR). The two-pass quantile based noise estimation algorithm presented in this paper has the ability to track slow changing non-stationary noise and obtains good estimates for various noise types over a wide range of SNR le...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016